Smart Paper Notes

R2C-GAN: Restore-to-Classify GANs for Blind X-Ray Restoration and COVID-19 Classification

Mete Ahishali, Aysen Degerli, Serkan Kiranyaz, Tahir Hamid, Rashid Mazhar, Moncef Gabbouj

Categories: Computer Vision | Machine Learning

2022-09-29

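Restoring poor-quality images that contain a blended set of artifacts plays a vital role in reliable diagnosis. Existing studies have focused on specific restoration problems such as image deblurring, denoising, and exposure correction, usually under strong assumptions about the artifact type and severity. As a pioneering study in blind X-ray restoration, we propose a joint model for generic image restoration and classification: Restore-to-Classify Generative Adversarial Networks (R2C-GANs). Such a jointly optimized model keeps any disease intact after restoration, which naturally leads to higher diagnostic performance thanks to the improved X-ray image quality. To accomplish this crucial objective, we define the restoration task as an image-to-image translation problem from the domain of poor-quality images (noisy, blurry, or over/under-exposed) to the high-quality image domain. The proposed R2C-GAN model is able to learn forward and inverse transforms between the two domains using unpaired training samples. Simultaneously, the joint classification preserves the disease label during restoration. Moreover, R2C-GANs are equipped with operational layers/neurons, which reduce the network depth and further boost both restoration and classification performance. The proposed joint model is extensively evaluated on the QaTa-COV19 dataset for Coronavirus Disease 2019 (COVID-19) classification. The proposed restoration approach achieves an F1-score of over 90%, which is significantly higher than the performance of any deep model. Moreover, in the qualitative analysis, the restoration performance of R2C-GANs was approved by a group of medical doctors. We share the software implementation at https://github.com/meteahishali/r2c-gan.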


Early Myocardial Infarction Detection over Multi-view Echocardiography

Aysen Degerli, Serkan Kiranyaz, Tahir Hamid, Rashid Mazhar, Moncef Gabbouj

Categories: Artificial Intelligence | Computer Vision | Machine Learning

2021-11-09

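Myocardial infarction (MI) is the leading cause of mortality in the world and occurs when the coronary arteries feeding the myocardium become blocked. Early diagnosis of MI and its localization can mitigate the extent of myocardial damage by facilitating early therapeutic interventions. Following the blockage of a coronary artery, the regional wall motion abnormality (RWMA) of the ischemic myocardial segments is the earliest change to set in. Echocardiography is the fundamental tool for assessing any RWMA. Assessing the motion of the left ventricle (LV) wall from only a single echocardiography view may lead to a missed MI diagnosis, because the RWMA may not be visible in that specific view. Therefore, in this study, we propose to fuse the apical 4-chamber (A4C) and apical 2-chamber (A2C) views, in which a total of 11 myocardial segments can be analyzed for MI detection. The proposed method first estimates the motion of the LV wall using Active Polynomials (APs), which extract and track the endocardial boundary to compute myocardial segment displacements. Features are extracted from the A4C and A2C view displacements, then fused and fed into classifiers to detect MI. The main contributions of this study are 1) the creation of a new benchmark dataset of 260 echocardiography recordings that includes both A4C and A2C views and is publicly shared with the research community, 2) an improvement over the prior threshold-based AP work through a machine learning based approach, and 3) a pioneering MI detection approach over multi-view echocardiography that fuses the information of the A4C and A2C views. Experimental results show that the proposed method achieves 90.91% sensitivity and 86.36% precision for MI detection over multi-view echocardiography.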


Fully Automated 2D and 3D Convolutional Neural Networks Pipeline for Video Segmentation and Myocardial Infarction Detection in Echocardiography

Oumaima Hamila, Sheela Ramanna, Christopher J. Henry, Serkan Kiranyaz, Ridha Hamila, Rashid Mazhar, Tahir Hamid

Categories: Computer Vision | Machine Learning

2021-03-26

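Cardiac imaging known as echocardiography is a non-invasive tool used to produce data, including images and videos, that cardiologists use to diagnose cardiac abnormalities in general and myocardial infarction (MI) in particular. Echocardiography machines can deliver large amounts of data that need to be analyzed quickly by cardiologists to help them diagnose and treat cardiac conditions. However, the quality of the acquired data varies with the acquisition conditions and the patient's responsiveness to the setup instructions. These constraints are challenging for doctors, especially when patients are facing MI and their lives are at stake. In this paper, we propose an innovative real-time, end-to-end, fully automated model based on convolutional neural networks (CNNs) to detect MI from regional wall motion abnormalities (RWMA) of the left ventricle (LV) in videos produced by echocardiography. Our model is implemented as a pipeline consisting of a 2D CNN that segments the data, followed by a 3D CNN that detects MI. We trained both CNNs on a dataset of 165 echocardiography videos, each acquired from a distinct patient. The 2D CNN achieved an accuracy of 97.18% on data segmentation, while the 3D CNN achieved 90.9% accuracy, 100% precision, and 95% recall. Our results demonstrate that creating a fully automated system for MI detection is feasible and advantageous.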


Using Active Learning Methods to Strategically Select Essays for Automated Scoring

Tahereh Firoozi, Hamid Mohammadi, Mark J. Gierl

Categories: Natural Language Processing

2023-01-02

Research on automated essay scoring has become increasingly important because it serves as a method for evaluating students' written responses at scale. Scalable methods for scoring written responses are needed as students migrate to online learning environments, resulting in the need to evaluate large numbers of written-response assessments. The purpose of this study is to describe and evaluate three active learning methods that can be used to minimize the number of essays that must be scored by human raters while still providing the data needed to train a modern automated essay scoring system. The three active learning methods are the uncertainty-based, the topological-based, and the hybrid method. These three methods were used to select essays included as part of the Automated Student Assessment Prize competition, which were then classified using a scoring model trained with the Bidirectional Encoder Representations from Transformers (BERT) language model. All three active learning methods produced strong results, with the topological-based method producing the most efficient classification. Growth rate accuracy was also evaluated. The active learning methods produced different levels of efficiency under different sample size allocations but, overall, all three methods were highly efficient and produced classifications that were similar to one another.
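
To illustrate the uncertainty-based idea, the sketch below ranks unscored essays by the entropy of a model's predicted score distribution and picks the most uncertain ones for human rating. The function names, the entropy criterion, and the toy probabilities are all assumptions made for illustration; the abstract does not state which uncertainty measure the study used.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(predictions, k):
    """Return the ids of the k essays whose predicted score distribution
    has the highest entropy (hypothetical helper, not from the paper)."""
    ranked = sorted(
        predictions,
        key=lambda essay_id: entropy(predictions[essay_id]),
        reverse=True,
    )
    return ranked[:k]


# Toy predicted probabilities over three score levels (made-up values).
predictions = {
    "essay_a": [0.98, 0.01, 0.01],  # confident prediction
    "essay_b": [0.34, 0.33, 0.33],  # nearly uniform, most uncertain
    "essay_c": [0.70, 0.20, 0.10],  # in between
}

# Essays to send to human raters in this round.
to_label = select_most_uncertain(predictions, k=2)
```

In an active learning loop, the essays in `to_label` would be scored by human raters, added to the training set, and the model retrained before the next selection round.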


Word Embedding Neural Networks to Advance Knee Osteoarthritis Research

Soheyla Amirian, Husam Ghazaleh, Mehdi Assefi, Hilal Maradit Kremers, Hamid R. Arabnia, Johannes F. Plate, Ahmad P. Tafti

Categories: Artificial Intelligence | Machine Learning

2022-12-22

Osteoarthritis (OA) is the most prevalent chronic joint disease worldwide, with knee OA accounting for more than 80% of commonly affected joints. Knee OA is not yet curable, and it affects large numbers of patients, making it costly to patients and healthcare systems. The etiology, diagnosis, and treatment of knee OA may be debated because of variability in its clinical and physical manifestations. Although knee OA carries a list of well-known terminology aiming to standardize the nomenclature of the diagnosis, prognosis, treatment, and clinical outcomes of the chronic joint disease, in practice there is a wide range of terminology associated with knee OA across different data sources, including but not limited to biomedical literature, clinical notes, healthcare literacy, and health-related social media. Among these data sources, the scientific articles published in the biomedical literature usually provide a principled pipeline for studying the disease. Rapid yet accurate text mining of large-scale scientific literature may discover novel knowledge and terminology to better understand knee OA and to improve the quality of knee OA diagnosis, prevention, and treatment. The present work aims to use artificial neural network strategies to automatically extract vocabularies associated with knee OA. Our findings indicate the feasibility of developing word embedding neural networks for autonomous keyword extraction and abstraction of knee OA.


Benchmarking Spatial Relationships in Text-to-Image Generation

Tejas Gokhale, Hamid Palangi, Besmira Nushi, Vibhav Vineet, Eric Horvitz, Ece Kamar, Chitta Baral, Yezhou Yang

Categories: Computer Vision | Artificial Intelligence | Natural Language Processing

2022-12-20

Spatial understanding is a fundamental aspect of computer vision and integral for human-level reasoning about images, making it an important component for grounded language understanding. While recent large-scale text-to-image synthesis (T2I) models have shown unprecedented improvements in photorealism, it is unclear whether they have reliable spatial understanding capabilities. We investigate the ability of T2I models to generate correct spatial relationships among objects and present VISOR, an evaluation metric that captures how accurately the spatial relationship described in text is generated in the image. To benchmark existing models, we introduce a large-scale challenge dataset SR2D that contains sentences describing two objects and the spatial relationship between them. We construct and harness an automated evaluation pipeline that employs computer vision to recognize objects and their spatial relationships, and we employ it in a large-scale evaluation of T2I models. Our experiments reveal a surprising finding that, although recent state-of-the-art T2I models exhibit high image quality, they are severely limited in their ability to generate multiple objects or the specified spatial relations such as left/right/above/below. Our analyses demonstrate several biases and artifacts of T2I models such as the difficulty with generating multiple objects, a bias towards generating the first object mentioned, spatially inconsistent outputs for equivalent relationships, and a correlation between object co-occurrence and spatial understanding capabilities. We conduct a human study that shows the alignment between VISOR and human judgment about spatial understanding. We offer the SR2D dataset and the VISOR metric to the community in support of T2I spatial reasoning research.
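
As a rough sketch of how an automated check of this kind might work, the code below takes bounding boxes (as an object detector might return them) and compares box centroids to decide whether a relation such as "left of" holds. The box format, the function names, and the centroid rule are assumptions for illustration; this is not the VISOR definition.

```python
def centroid(box):
    """Centre (x, y) of a box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2, (y_min + y_max) / 2)


def relation_holds(box_a, box_b, relation):
    """Check whether object A stands in `relation` to object B.

    Uses image coordinates, where y grows downward, so "above" means a
    smaller y value. Hypothetical helper, not the paper's metric.
    """
    (ax, ay), (bx, by) = centroid(box_a), centroid(box_b)
    checks = {
        "left": ax < bx,
        "right": ax > bx,
        "above": ay < by,
        "below": ay > by,
    }
    return checks[relation]


# Made-up detections for the prompt "a dog to the left of a chair".
dog = (10, 40, 60, 90)
chair = (120, 30, 200, 110)
dog_left_of_chair = relation_holds(dog, chair, "left")
```

A full evaluation would also have to handle the case where one or both objects are not detected at all, which the abstract reports as a common failure of current T2I models.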


What do Vision Transformers Learn? A Visual Exploration

Amin Ghiasi, Hamid Kazemi, Eitan Borgnia, Steven Reich, Manli Shu, Micah Goldblum, Andrew Gordon Wilson, Tom Goldstein

Categories: Computer Vision

2022-12-13

Vision transformers (ViTs) are quickly becoming the de-facto architecture for computer vision, yet we understand very little about why they work and what they learn. While existing studies visually analyze the mechanisms of convolutional neural networks, an analogous exploration of ViTs remains challenging. In this paper, we first address the obstacles to performing visualizations on ViTs. Assisted by these solutions, we observe that neurons in ViTs trained with language model supervision (e.g., CLIP) are activated by semantic concepts rather than visual features. We also explore the underlying differences between ViTs and CNNs, and we find that transformers detect image background features, just like their convolutional counterparts, but their predictions depend far less on high-frequency information. On the other hand, both architecture types behave similarly in the way features progress from abstract patterns in early layers to concrete objects in late layers. In addition, we show that ViTs maintain spatial information in all layers except the final layer. In contrast to previous works, we show that the last layer most likely discards the spatial information and behaves as a learned global pooling operation. Finally, we conduct large-scale visualizations on a wide range of ViT variants, including DeiT, CoaT, ConViT, PiT, Swin, and Twin, to validate the effectiveness of our method.


Multi-task Learning for Personal Health Mention Detection on Social Media

Olanrewaju Tahir Aduragba, Jialin Yu, Alexandra I. Cristea

Categories: Natural Language Processing | Artificial Intelligence

2022-12-09

Detecting personal health mentions on social media is essential to complement existing health surveillance systems. However, annotating data for detecting health mentions at a large scale is a challenging task. This research employs a multitask learning framework to leverage available annotated data from a related task to improve the performance on the main task to detect personal health experiences mentioned in social media texts. Specifically, we focus on incorporating emotional information into our target task by using emotion detection as an auxiliary task. Our approach significantly improves a wide range of personal health mention detection tasks compared to a strong state-of-the-art baseline.


Incorporating Emotions into Health Mention Classification Task on Social Media

Olanrewaju Tahir Aduragba, Jialin Yu, Alexandra I. Cristea

Categories: Natural Language Processing | Machine Learning

2022-12-09

The health mention classification (HMC) task is the process of identifying and classifying mentions of health-related concepts in text. This can be useful for identifying and tracking the spread of diseases through social media posts. However, this is a non-trivial task. Here we build on recent studies suggesting that using emotional information may improve upon this task. Our study results in a framework for health mention classification that incorporates affective features. We present two methods, an intermediate task fine-tuning approach (implicit) and a multi-feature fusion approach (explicit) to incorporate emotions into our target task of HMC. We evaluated our approach on 5 HMC-related datasets from different social media platforms including three from Twitter, one from Reddit and another from a combination of social media sources. Extensive experiments demonstrate that our approach results in statistically significant performance gains on HMC tasks. By using the multi-feature fusion approach, we achieve at least a 3% improvement in F1 score over BERT baselines across all datasets. We also show that considering only negative emotions does not significantly affect performance on the HMC task. Additionally, our results indicate that HMC models infused with emotional knowledge are an effective alternative, especially when other HMC datasets are unavailable for domain-specific fine-tuning. The source code for our models is freely available at https://github.com/tahirlanre/Emotion_PHM.


GTFLAT: Game Theory Based Add-On For Empowering Federated Learning Aggregation Techniques

Hamidreza Mahini, Hamid Mousavi, Masoud Daneshtalab

Categories: Machine Learning | Artificial Intelligence

2022-12-08

GTFLAT, as a game theory-based add-on, addresses an important research question: How can a federated learning algorithm achieve better performance and training efficiency by setting more effective adaptive weights for averaging in the model aggregation phase? The main objectives for the ideal method of answering the question are: (1) empowering federated learning algorithms to reach better performance in fewer communication rounds, notably in the face of heterogeneous scenarios, and last but not least, (2) being easy to use alongside the state-of-the-art federated learning algorithms as a new module. To this end, GTFLAT models the averaging task as a strategic game among active users. Then it proposes a systematic solution based on the population game and evolutionary dynamics to find the equilibrium. In contrast with existing approaches that impose the weights on the participants, GTFLAT concludes a self-enforcement agreement among clients in a way that none of them is motivated to deviate from it individually. The results reveal that, on average, using GTFLAT increases the top-1 test accuracy by 1.38%, while it needs 21.06% fewer communication rounds to reach the accuracy.
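
The aggregation step that such adaptive weights feed into can be sketched as a weighted average of client parameters. The function name, the list-based parameter representation, and the example weights below are assumptions for illustration; how GTFLAT derives the weights from the game's equilibrium is not shown.

```python
def weighted_average(client_params, weights):
    """Aggregate client parameter vectors using normalized weights."""
    if len(client_params) != len(weights):
        raise ValueError("need one weight per client")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    size = len(client_params[0])
    return [
        sum(w * params[i] for params, w in zip(client_params, weights)) / total
        for i in range(size)
    ]


# Three hypothetical clients, each holding a two-parameter model.
clients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

uniform = weighted_average(clients, [1, 1, 1])   # plain averaging
adaptive = weighted_average(clients, [2, 1, 1])  # first client counts double
```

With equal weights this reduces to plain averaging of the client models; an adaptive scheme changes only the `weights` argument, which is what lets such a method be added as a module alongside an existing federated learning algorithm.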
